A Robust Density-Based Clustering Approach Using DBCURE-MapReduce Techniques
Abstract
Clustering is the process of grouping similar data into the same cluster and dissimilar data into different clusters. Density-based clustering, as implemented by algorithms such as DBSCAN and OPTICS, is a useful clustering approach. The increasing volume of data and the varying sizes of data sets make the clustering process challenging. We therefore propose a parallel clustering framework built on MapReduce. We first develop a new algorithm, DBCURE, which is a modification of DBSCAN. We then propose DBCURE-MR, a parallel algorithm that parallelizes DBCURE with the MapReduce framework. The proposed method finds clusters efficiently in parallel.
Keywords: MapReduce, DBCURE, density-based clustering, parallelization algorithm.
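For context, the DBSCAN baseline from which the abstract says DBCURE is derived can be sketched as follows. This is a minimal, single-machine sketch of standard DBSCAN only; it is not DBCURE or DBCURE-MR, whose details the abstract does not give. The names `dbscan`, `region_query`, `eps`, and `min_pts` are illustrative assumptions, and the brute-force neighbor search is chosen for brevity, not efficiency.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Return indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Assign a cluster id (0, 1, ...) or NOISE to every point."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        # points[i] is a core point: start a new cluster and expand it
        labels[i] = cluster_id
        seeds = list(neighbors)
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is also a core point
        cluster_id += 1
    return labels


# Two dense groups and one isolated point (made-up data for illustration)
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

Each `region_query` call scans every point, so this sketch costs quadratic time in the number of points, which illustrates why large data sets motivate a parallel formulation.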
Similar resources
Bilateral Weighted Fuzzy C-Means Clustering
Nowadays, the Fuzzy C-Means method has become one of the most popular clustering methods based on minimization of a criterion function. However, the performance of this clustering algorithm may degrade significantly in the presence of noise. This paper presents a robust clustering algorithm called Bilateral Weighted Fuzzy C-Means (BWFCM). We used a new objective function that uses some k...
MapReduce K-Means based Co-Clustering Approach for Web Page Recommendation System
Co-clustering is one of the data mining techniques used for web usage mining. Co-clustering Web log data is the process of simultaneously categorizing both users and pages. It is used to extract users’ information based on a subset of pages. Nowadays, cyberspace is filled with a huge volume of data distributed across the world. The business knowledge acquaintance from such a voluminous d...
A density based clustering approach to distinguish between web robot and human requests to a web server
Today, the world's dependence on the Internet and the emergence of Web 2.0 applications are significantly increasing the need for web robots that crawl sites to support services and technologies. Regardless of the advantages of robots, they may consume bandwidth and reduce the performance of web servers. Despite a variety of research, there is no accurate method for classifying huge data ...
A Multi-Objective Approach to Fuzzy Clustering using ITLBO Algorithm
Data clustering is one of the most important areas of research in data mining and knowledge discovery. Recent research in this area has shown that the best clustering results can be achieved using multi-objective methods. In other words, assuming more than one criterion as objective functions for clustering data can measurably increase the quality of clustering. In this study, a model with two ...
Extraction of Respiratory Signal Based on Image Clustering and Intensity Parameters at Radiotherapy with External Beam: A Comparative Study
Background: Since tumors located in the thoracic region of the body move mainly due to respiration, modern radiotherapy has seen many attempts, such as external markers, strain gauges, and spirometers, to monitor patients’ breathing signals. With the advent of fluoroscopy, indirect methods were proposed as an alternative approach to extract patients’ breathing signals...